Compressing Very Large Database Workloads for Continuous Online Index Selection
نویسنده
چکیده
The paper presents a novel method for compressing large database workloads for purpose of autonomic, continuous index selection. The compressed workload contains a small subset of representative queries from the original workload. A single pass clustering algorithm with a simple and elegant selectivity based query distance metric guarantees low memory and time complexity. Experiments on two real-world database workloads show the method achieves high compression ratio without decreasing the quality of the index selection problem solutions.
منابع مشابه
Analysis of the Characteristics of Production Database Workloads and Comparison with the TPC Benchmarks
There has been very little empirical analysis of any real production database workloads. Although The Transaction Processing Performance Council benchmarks C (TPC-C) and D (TPC-D) have become the standard benchmarks for online transaction processing and decision support systems respectively, there has also not been any major effort to systematically analyze their workload characteristics, espec...
متن کاملHCS: Hierarchical Cut Selection for Efficiently Processing Queries on Data Columns using Hierarchical Bitmap Indices
When data are large and query processing workloads consist of data selection and aggregation operations (as in online analytical processing), column-oriented data stores are generally the preferred choice of data organization, because they enable effective data compression, leading to significantly reduced IO. Most columnstore architectures leverage bitmap indices, which themselves can be compr...
متن کاملCharacteristics of production database workloads and the TPC benchmarks
There has been very little empirical analysis of any real production database workloads. Although the Transaction Processing Performance Council benchmarks C (TPC-C) and D (TPC-D) have become the standard benchmarks for on-line transaction processing and decision support systems, respectively, there has not been any major effort to systematically analyze their workload characteristics, especial...
متن کاملAutomated Performance Tuning of Data Management Systems with Materializations and Indices
Automated performance tuning of data management systems offer various benefits such as improved performance, declined administration costs, and reduced workloads to database administrators (DBAs). Currently, DBAs tune the performance of database systems with a little help from the database servers. In this paper, we propose a new technique for automated performance tuning of data management sys...
متن کاملThe Case for Automatic Database Administration using Deep Reinforcement Learning
Like any large software system, a full-fledged DBMS offers an overwhelming amount of configuration knobs. These range from static initialisation parameters like buffer sizes, degree of concurrency, or level of replication to complex runtime decisions like creating a secondary index on a particular column or reorganising the physical layout of the store. To simplify the configuration, industry g...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2008